Abstract. Although serine proteases are found in all kinds of cellular organisms and 
many viruses, the classic "chymotrypsin family" (Group S1A by the 1998 Barrett 
nomenclature) has an unusual phylogenetic distribution, being especially common 
in animals, entirely absent from plants and protists, and rare among fungi. The 
distribution in Bacteria is largely restricted to the genus Streptomyces, although a 
few isolated occurrences in other bacteria have been reported. The family may be 
entirely absent from Archaea. Although more than a thousand sequences have been 
reported for enzymes of this type from animals, none of them have been from early 
diverging phyla like Porifera or Cnidaria. We now report the existence of Group 
S1A serine proteases in a sponge (phylum Porifera) and a jellyfish (phylum 
Cnidaria), making it safe to conclude that all animal groups possess these enzymes. 



The origin of the "chymotrypsin family" of serine proteases (E.C. 3.4.21.-, or Group 
S1A, Barrett et al, 1998) in animals poses a perplexing problem in phylogenetics. 
There are hundreds of different enzymes from this group among 
invertebrate and vertebrate animals, but members of the family are not found at all 
among plants or protists. In fungi they are restricted to a single small group of 
ascomycetes, and in Bacteria, with only a few exceptions, they are restricted to the 
genus Streptomyces. These enzymes, which are mostly extracellular and contain 
three or more disulfide bonds, are extremely abundant in both vertebrate and 
invertebrate animals. 

Although these enzymes have been reviewed extensively in the past (e.g., Lesk 
and Fordham, 1996), the issue of where and how the animal cohort of enzymes 
originated has not been specifically addressed in recent times. The current version of 
the MEROPS database contains about a thousand entries for such enzymes from 48 
different animal species. Of these, about 200 are from two dozen different 
invertebrate species. None of these, however, are from early diverging phyla like 
Porifera and Cnidaria. Given the overall sporadic occurrence of these enzymes and 
the frequent invocation of horizontal gene transfer in explanation (e.g., Screen et al, 
2000), we felt it important to determine whether sponges and jellyfish possess these 
enzymes, and if so, where their sequences fall in a general phylogeny. 

We isolated total RNA (Chomczynski and Sacchi, 1987) from a sponge (Verongia 
aurea) and a jellyfish (Aurelia aurita) and prepared cDNA, first in single-stranded 
form and then double-stranded (Chenchik et al, 1996). We constructed sets of 
degenerate primers based on extensive alignments of genes coding for this kind of 
protease among various invertebrate animals. These primers were used in 
conjunction with sets of universal primers with adapters that facilitated the 
isolation of the appropriate messages (Matz et al, 1999a; 1999b). Extensive use was 
made of kits from Clontech, Invitrogen and Qiagen. PCR was conducted on a 
Perkin Elmer thermocycler. Drosophila melanogaster ds-DNA was used as a 
positive control to test the effectiveness of the primers. 
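
The primer sequences themselves are not listed here. As a rough illustration of why degenerate primers are needed when amplifying from a conserved protein region, the sketch below counts how many distinct oligonucleotides a fully degenerate primer would have to cover for a given peptide. The motif GDSGGP (the conserved stretch around the active-site serine in this family) is used only as an assumed example; it is not necessarily one of the regions targeted in this work, and the function name is hypothetical.

```python
from math import prod

# Number of codons for each amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def primer_degeneracy(peptide: str) -> int:
    """Count the nucleotide sequences that can encode the given peptide."""
    return prod(CODON_COUNTS[aa] for aa in peptide.upper())


# Assumed example motif, chosen for illustration only.
motif = "GDSGGP"
print(motif, primer_degeneracy(motif))
```

In practice the degeneracy of a primer is usually reduced, for example by ending the primer before the third position of the last codon or by choosing regions rich in amino acids with few codons.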

Bands of DNA of the expected lengths (600-700 nt) were extracted from agarose 
gels after electrophoresis and subcloned into appropriate vectors. Multiple 
transformants were picked and submitted for DNA sequencing at the UCSD campus 
sequencing facility. In the end, sequences corresponding to serine proteases were 
isolated from each of the three kinds of starting DNA, including an already 
identified serine protease from fruitfly (GenBank Accession AAL49280), a trypsin-like 
enzyme from sponge, and two serine proteases from jellyfish. In the cases of the 
sponge and jellyfish, sequences were confirmed by the use of different sets of 
primers that yielded overlapping regions. The single sequence identified in sponge 
encompassed 243 amino acid codons. Of the two isolated from jellyfish, one spanned 
232 residues, but in the other the sequence did not extend beyond the 5' degenerate 
primers, and only 183 codons were obtained. The two jellyfish sequences differed at 
53 of the 185 comparable positions (71 percent identical at the amino acid level). The 
sponge and one of the jellyfish sequences were confirmed by constructing exact 
primers from them, re-isolating bands from new cDNA from separate specimens, 
and determining the sequences again. 
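
The identity figure quoted above follows directly from the counts: 53 differences in 185 comparable positions leave 132 matches, or about 71 percent. A minimal sketch of that arithmetic is given below, together with a helper that computes the same quantity from a pair of aligned sequences. The function names and the convention of skipping gapped columns are assumptions made for illustration, not a description of the software actually used.

```python
def identity_from_counts(comparable: int, differences: int) -> float:
    """Percent identity from the number of comparable and differing positions."""
    return 100.0 * (comparable - differences) / comparable


def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Counts reported in the text for the two jellyfish sequences.
print(round(identity_from_counts(185, 53), 1))
```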

The three sequences have been submitted to GenBank: jellyfish serine protease 1, 
AF486486; jellyfish serine protease 2, AF486487; sponge trypsin, AF486488. An 
alignment of the sponge and jellyfish sequences with those from several other 
animals is shown in Fig. 1. 



A phylogenetic tree was constructed with these sequences and a representative 
set from other animals, both vertebrate and invertebrate, as well as several from 
bacteria and fungi (Fig. 2). Both the sponge and jellyfish sequences appear relatively 
high on the tree; certainly neither is at the animal root. Interestingly, the two 
enzymes appear in different major groups, the sponge clustering among the 
digestive enzymes most often labeled trypsins, and the two jellyfish sequences with 
the chymotrypsin-elastase group. The sponge sequence also had an aspartic acid at 
"position 189," a diagnostic residue that occurs in all known trypsins (Hannenhalli 
and Russell, 2000). In contrast, the two jellyfish sequences have an asparagine at that 
position. The sponge and jellyfish enzymes also differ with regard to the codon 
employed for the serine at the active site, the sponge having a typical trypsin 
signature with a TCT codon, whereas one jellyfish sequence has the AGT codon and 
the other AGC. When the new sequences were searched against GenBank with Blast 
(Altschul et al, 1990), the highest scoring match for the sponge sequence was an 
arthropod trypsin (48 percent identity), whereas the best match for either of the 
jellyfish sequences was rat elastase (36 percent identity). The jellyfish enzymes are the 
first S1A serine proteases from an invertebrate animal with five disulfide bonds, all 
the cysteines of which match vertebrate elastase and chymotrypsin exactly. 
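
The significance of the active-site serine codon can be made concrete. In the standard genetic code the six serine codons fall into two families, TCN and AGY, which differ at both of the first two positions, so a switch from one family to the other cannot occur by a single nucleotide change without passing through a codon for a different amino acid. The sketch below classifies the three codons reported above; the function names are hypothetical and the labels for the two jellyfish sequences are arbitrary, since the text does not say which sequence carries which codon.

```python
SERINE_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}


def serine_codon_family(codon: str) -> str:
    """Return 'TCN' or 'AGY' for a serine codon (standard genetic code)."""
    codon = codon.upper().replace("U", "T")
    if codon not in SERINE_CODONS:
        raise ValueError(f"{codon} does not encode serine")
    return "TCN" if codon.startswith("TC") else "AGY"


def nucleotide_differences(codon_a: str, codon_b: str) -> int:
    """Number of positions at which two codons differ."""
    return sum(a != b for a, b in zip(codon_a.upper(), codon_b.upper()))


# Active-site serine codons reported in the text.
observed = [
    ("sponge", "TCT"),
    ("jellyfish (one sequence)", "AGT"),
    ("jellyfish (other sequence)", "AGC"),
]
for source, codon in observed:
    print(source, codon, serine_codon_family(codon))

# Sponge versus jellyfish codons differ at two positions.
print(nucleotide_differences("TCT", "AGT"))
```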

Although the S1A serine proteases from animals cover a wide variety of 
functional designations and are involved in physiological processes ranging from 
blood clotting to moulting, the largest number of reported enzymes have to do with 
feeding and digestion, including about 100 entries labeled "trypsin." Jellyfish 
(Cnidaria) have genuine organs and tissues, including a mouth and gastrovascular 
cavity. They have a well developed digestive apparatus that allows for the capture, 
swallowing and digestion of food, and we fully expected they would have digestive 
enzymes. Sponges, on the other hand, are loose colonies of cells that are filter 
feeders. They lack a true body cavity. The surrounding water and small particulate 
matter pass through pores that perforate special cells called porocytes into an inner 
chamber, from which the flow is guided to an exit called the osculum by flagellated 
cells called choanocytes that line the chamber. Presumably much of the digestion is 
extracellular, but little is known of the process. The existence of a typical trypsin-like 
enzyme supports that notion. 

The origin of animal serine proteases remains enigmatic. If they are the result of 
vertical descent in a conventional manner, then it must be presumed that protists, 
plants and most fungi have lost these genes. With regard to the fungi, good cases 
have been made for the transfer of genes between ascomycetes and bacteria of the 
genus Streptomyces in both directions. Thus, the ascomycete Fusarium oxysporum, 
which has an animal-like trypsin (Rypniewski et al 1993), also has a respiratory 
system which looks to have been acquired from Streptomyces (Takaya et al 1998). 
But another ascomycete has a chymotrypsin which appears to have been transferred 
in the other direction, i.e., from the fungus to Streptomyces (Screen et al, 2000). We 
plan to address these apparently contradictory observations in a separate publication 
devoted strictly to phylogenetic analysis (Rojas and Doolittle, in preparation). 

FIGURE LEGENDS 


Figure 1. Alignment of serine protease sequences from the sulfur sponge (Verongia 
aurea) and moon jellyfish (Aurelia aurita) with a representative set of S1A 
sequences from vertebrate and invertebrate animals. Cysteine residues are colored 
yellow, except for two in the jellyfish enzyme that coincide exactly with vertebrate 
elastases and chymotrypsins but which have not been found elsewhere among the 
invertebrate animals. 

Figure 2. Phylogenetic tree constructed from 34 S1A serine protease sequences, 
including trypsin from the sulfur sponge (Verongia aurea), a serine protease 
from a moon jellyfish (Aurelia aurita), and a representative set of serine protease 
sequences from vertebrate and invertebrate animals, as well as three from fungi and 
one from a bacterium. A single type A1E enzyme (alkaline serine protease from 
Streptomyces sp.) was used as an outlier for rooting purposes. The tree was made by 
the method of Feng and Doolittle (1996). 